Use selenium for web browsing #121

Closed
wants to merge 1 commit into from

Agusx1211

  • Allows browsing dynamic pages (Twitter, Google, etc.).
  • Lets the agent read the whole page (paginated view).
    • Resuming can be added back later, but I think this performs better.
  • I added "No images" to the base prompt; otherwise it tries to fetch images.
  • Selenium can later be used to interact with the page further (click buttons, fill forms).
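
As a rough illustration of the paginated view described in the list above, the sketch below renders a page in a browser and splits its visible text into fixed-size pages. The function names `paginate` and `fetch_page_text`, the headless option, and the 2000-character page size are assumptions made for this example; they are not taken from the code in this PR.

```python
from typing import List


def paginate(text: str, page_size: int = 2000) -> List[str]:
    """Split text into consecutive chunks of at most page_size characters."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [text[i:i + page_size] for i in range(0, len(text), page_size)]


def fetch_page_text(url: str) -> str:
    """Render a page in headless Chrome and return its visible text.

    Hypothetical helper: a real dynamic page may also need an explicit
    wait before its content is fully loaded.
    """
    # Imported lazily so paginate() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.find_element(By.TAG_NAME, "body").text
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

With this shape, the agent could request one page at a time, for example `paginate(fetch_page_text(url))[0]` for the first page.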

@Durafen

Durafen commented Apr 4, 2023

The AI appears to like paging and reading more detailed websites. However, I think there should be an option to obtain a get_text_summary of the whole webpage via the prompt.

@Torantulino
Member

@Agusx1211 Could you please resubmit this as an optional feature (enabled by an argument) rather than replacing the existing code?

There have been some good improvements to browsing since this was submitted, but we're still looking for a major solution.

@Agusx1211
Author

I can't work on this until late tonight. What do you mean by optional? Like adding a flag that changes the behavior?

@developerisnow

> I can't work on this until late tonight. What do you mean by optional? Like adding a flag that changes the behavior?

Yes, it looks like that is what he meant. It's a major improvement, so it needs a separate flag.

@ryanmac
Contributor

ryanmac commented Apr 5, 2023

See also PR #96 which uses Playwright instead of requests.

But I also second the request that this is triggered by an argument.
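
To make the request concrete, here is a minimal sketch of what "triggered by an argument" could look like. The flag name `--use-selenium` and the helper names are hypothetical, chosen only for illustration; the point is that the existing requests-based path stays the default unless the flag is passed.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Agent launcher (illustrative)")
    # Hypothetical flag: off by default, so existing behavior is unchanged.
    parser.add_argument(
        "--use-selenium",
        action="store_true",
        help="Browse with Selenium instead of the default static fetcher.",
    )
    return parser


def choose_browser_backend(argv: Optional[List[str]] = None) -> str:
    """Return the name of the browsing backend selected on the command line."""
    args = build_parser().parse_args(argv)
    return "selenium" if args.use_selenium else "requests"
```

Calling `choose_browser_backend([])` keeps the default path, while `choose_browser_backend(["--use-selenium"])` opts in to the new one.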

@sberney

sberney commented Apr 6, 2023

I think this should be an alternative command that the AI can run if the default (static) BeautifulSoup parser fails.
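
One way to picture this fallback is sketched below. Everything here is an assumption for illustration: `browse_with_fallback`, its `static_fetch` and `dynamic_fetch` parameters, and the `min_chars` threshold are hypothetical names, and "fails" is interpreted as raising an exception or returning too little text.

```python
from typing import Callable, Optional


def browse_with_fallback(
    url: str,
    static_fetch: Callable[[str], Optional[str]],
    dynamic_fetch: Callable[[str], str],
    min_chars: int = 1,
) -> str:
    """Try the static parser first; use the browser only if it fails."""
    try:
        text = static_fetch(url)
    except Exception:
        # Treat any error from the static path as a failure and fall back.
        text = None
    if text is not None and len(text.strip()) >= min_chars:
        return text
    # The static result was missing or effectively empty, so render the page.
    return dynamic_fetch(url)
```

Passing the fetchers in as callables keeps the cheap static path as the default and means the browser is only started when it is actually needed.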

@nponeccop
Contributor

One important point is that Selenium makes the project harder to set up, and this PR assumes that Chrome is the only Selenium web driver. So, for it to be merged, I guess (I'm not the maintainer) it should be an entirely optional feature with a separate document describing the installation, keeping the current "for dummies" installation procedure as is.

Contributor

@nponeccop nponeccop left a comment

LGTM except for the hardcoded Chrome driver. Support at least Firefox and Edge as well, both in the code and in the instructions.

And rebase against the current master.
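
A minimal sketch of how the driver choice could be made configurable instead of hardcoded is shown below. The names `SUPPORTED_BROWSERS`, `normalize_browser`, and `make_driver` are hypothetical; `webdriver.Chrome`, `webdriver.Firefox`, and `webdriver.Edge` are the standard Selenium driver classes, and each still needs its browser available on the machine.

```python
SUPPORTED_BROWSERS = ("chrome", "firefox", "edge")


def normalize_browser(name: str) -> str:
    """Validate a user-supplied browser name and return its canonical form."""
    key = name.strip().lower()
    if key not in SUPPORTED_BROWSERS:
        raise ValueError(
            f"Unsupported browser {name!r}; expected one of {SUPPORTED_BROWSERS}"
        )
    return key


def make_driver(name: str = "chrome"):
    """Create a Selenium driver for the requested browser."""
    key = normalize_browser(name)
    # Imported lazily so name validation works without Selenium installed.
    from selenium import webdriver

    factories = {
        "chrome": webdriver.Chrome,
        "firefox": webdriver.Firefox,
        "edge": webdriver.Edge,
    }
    return factories[key]()
```

Validating the name before importing Selenium means a typo in the configuration produces a clear error rather than a driver startup failure.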

Contributor

@nponeccop nponeccop left a comment

Resolve the conflicts and split the prompt changes.

Contributor

Separate the prompt changes into a different PR.

This was referenced Apr 10, 2023
@richbeales
Copy link
Contributor

Selenium is now in the master branch

@richbeales richbeales closed this Apr 15, 2023
8 participants